Google Unveils Enhanced Gemini Nano for Pixel 9 Series: Multimodal Capabilities and Hardware Requirements Explained
Posted on May 1, 2024 by Vinoth
Google’s Pixel 9 series will introduce an upgraded version of Gemini Nano, the company announced at its annual developer conference, Google I/O, on Tuesday. The new version boasts multimodal capabilities, letting it discern context from images, videos, sounds, and spoken language. It requires more powerful hardware, specifically more capable Neural Processing Units (NPUs), and might not be compatible with older Pixels.
Pixel 9 to bring an improved Gemini Nano
Gemini Nano is a mini version of Google’s generative AI model Gemini (formerly Bard). It is designed to run on smartphones and is the brain behind certain on-device AI features on the Pixel 8 series, including the Pixel 8a. The tool powers the Summarize feature in the Recorder app and Smart Reply in Gboard, among other things. However, its functionality is mostly limited to processing text inputs.
With the Pixel 9 series, Google plans to add new capabilities to Gemini Nano. A teaser posted by the company shows the AI tool describing the scene in front of the device’s camera. “Hey, what do you think is happening here?” a user asks, pointing the camera at the stage before Google I/O kicked off. Gemini Nano identified the scene: “It looks like people are setting up for a large event, perhaps a conference or presentation,” it replied.
The tool goes on to converse with the user about the scene, responding to various questions. The demo suggests the upgraded Gemini Nano can identify real-world imagery and discern context accurately. Google plans to use it to improve the TalkBack accessibility feature for people who are blind or have low vision, allowing the phone to describe unlabeled photos better than before without requiring an active internet connection.